Introduction


Type systems are crucial tools in the hands of developers.
The compiler's type checking is the first line of defence
against programmer error. Unfortunately, developers often do
not use type systems to their full capacity. This is most
prevalent with trivial values expressed by fundamental (e.g.
int) or library (e.g. string) types. Developers often resort
to embedding the meaning of a variable's value in its name
[1] rather than in its type. Perhaps the most widely known
example of such weak interfaces is the Mars Climate Orbiter
incident [2]. Due to a design misunderstanding, program
modules assumed different units of measurement while
communicating only in terms of “numbers”, which led to the
incident. Compilers can catch such mistakes, but only if the
program uses types more elaborate than double.
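To illustrate the last point, here is a minimal sketch of two unit-carrying wrapper types. The names NewtonSeconds, PoundForceSeconds, toNewtonSeconds and burnCommand are hypothetical and chosen only for this illustration; they are not taken from the incident's software or from our tool.

```cpp
#include <type_traits>

// Hypothetical unit types: each wraps a double but is a distinct type.
struct NewtonSeconds { double value; };
struct PoundForceSeconds { double value; };

// The only way from one unit to the other is this explicit conversion
// (1 lbf = 4.4482216152605 N).
constexpr NewtonSeconds toNewtonSeconds(PoundForceSeconds impulse) {
    return NewtonSeconds{impulse.value * 4.4482216152605};
}

// Hypothetical consumer that accepts SI impulse only.
constexpr double burnCommand(NewtonSeconds impulse) { return impulse.value; }

// The compiler accepts the matching unit and rejects both the wrong unit
// and a bare double.
static_assert(std::is_invocable_v<decltype(&burnCommand), NewtonSeconds>);
static_assert(!std::is_invocable_v<decltype(&burnCommand), PoundForceSeconds>);
static_assert(!std::is_invocable_v<decltype(&burnCommand), double>);
```

With plain double parameters, all three calls would compile, and the unit mismatch would surface only at run time, if at all.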

Type migration is the process of changing the types of
program elements. Conventionally, one would design the new
types in advance, specify a mapping from the old types to the
new ones, and perform the migration. If some operations are
left undefined, the code does not compile, which makes it
incomprehensible to developer tools. Instead, we propose an
approach which combines code comprehension and type migration
into the same process. By utilising the type system of an
existing language, but in an orthogonal plane, we are able to
interactively walk the developer through the discovery of
what a newly introduced type should look like.


Fictive Types 


Our approach uses the fictive types (‘ft’) annotation
technique to embed additional information about program
elements into the source code. The highlight of annotations
is that compiler and tool vendors are allowed to specify
their own set. Tools are encouraged to ignore annotations
they do not understand, maybe with an accompanying warning.
The following example shows a local variable whose “real”
type is int and whose fictive type is temperature. Existing
compilers and tools interact with the real type only, so the
project can be continually released, while tools aware of the
annotation interact with the fictive type. Fictive types
express only the “set membership” relation: “SensorTemp is-a
temperature”.


[[ft(temperature)]] int SensorTemp;


I(n)tera(c)tive refactoring process 


The refactoring consists of three steps. The propagation
steps are executed in a saturating fix-point iteration, in
which the developer is asked on demand to provide additional
input.
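The saturating fix-point iteration can be sketched as follows. This is a simplified model under our own assumptions, not the tool's implementation: program elements are plain names, a "flow" edge stands for an assignment or argument passing, and the interactive questions to the developer are left out. The names Taints, Flows and propagate are hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: element name -> fictive type it has been tainted with.
using Taints = std::map<std::string, std::string>;
// Hypothetical model: (from, to) means the value of `from` flows into `to`.
using Flows = std::vector<std::pair<std::string, std::string>>;

// Repeats rounds until a round adds no new taint (saturation).
// Returns the number of rounds that changed something.
int propagate(Taints& taints, const Flows& flows) {
    int rounds = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& [from, to] : flows) {
            auto source = taints.find(from);
            if (source == taints.end() || taints.count(to) != 0) continue;
            const std::string fictiveType = source->second;
            taints[to] = fictiveType;  // monotonic: taints are only ever added
            changed = true;
        }
        if (changed) ++rounds;
    }
    return rounds;
}
```

Because taints are only added and the set of elements is finite, the loop is guaranteed to terminate; calling propagate again on a saturated state changes nothing.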


Taint seeding 


Consider the previous example, wherein the developer decided
that some variable is-a “temperature”. This variable is
passed to a function somewhere, that function returns triple
the value, and the result is emitted to some output.


int threshold(int T) { return 3 * T; } 
int T2 = threshold(SensorTemp); 
write(T2); 


Code analysis and transformation are executed on the
abstract syntax tree (AST). A simplified AST for the previous
example, with the type colouring, is shown below.


[Figure: simplified AST of the example with fictive type colouring; recoverable node labels: “fn threshold”, “call threshold”.]
Propagation


The propagation is executed via a modified compiler built
upon the LLVM Compiler Infrastructure's Clang project. It is
a monotonic operation in which more and more program elements
receive a taint. In Round 1, the propagation tool discovers
that the fictively typed variable is passed to the function
parameter, a trivial propagation via assignment. This turns
threshold into a function taking “temperature”. As each round
is as expansive as possible, the type expression
<?> * temperature is also investigated, and a recoverable
error is generated for the undefined operation.


test.cpp:1:32: warning: use of undefined fictive
operator '<?> * temperature'
    3 * T;
test.cpp:1:30: note: left operand is 'int' literal,
configure overload or refactor into variable


Errors are recoverable by the developer changing the
parameters of the environment, which in practice is done via
an interactive configuration file. If the developer, for
example, decides to define 3 to be of fictive type factor,
and that factor * temperature is temperature, the code is
transformed, and the next round begins.


int threshold([[ft(temperature)]] int T) {
  [[ft(factor)]] int F = 3;
  return F * T;
}
int T2 = threshold(SensorTemp);
write(T2);
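The developer's decision that factor * temperature is temperature can be pictured as an entry in an operator table. The sketch below is an assumption about how such a configuration might be held in memory; the actual format of the configuration file is not shown here, and OperatorKey, OperatorTable and resultOf are hypothetical names.

```cpp
#include <map>
#include <optional>
#include <string>
#include <tuple>

// Hypothetical in-memory form of the configuration:
// (left fictive type, operator, right fictive type) -> result fictive type.
using OperatorKey = std::tuple<std::string, char, std::string>;
using OperatorTable = std::map<OperatorKey, std::string>;

// Returns the fictive result type, or std::nullopt when the operator is
// undefined. The latter corresponds to the recoverable error that prompts
// the developer for input.
std::optional<std::string> resultOf(const OperatorTable& table,
                                    const std::string& lhs, char op,
                                    const std::string& rhs) {
    auto entry = table.find(OperatorKey{lhs, op, rhs});
    if (entry == table.end()) return std::nullopt;
    return entry->second;
}
```

Note that in this sketch an entry for factor * temperature says nothing about temperature * factor; whether operators should be treated as commutative is a separate design decision.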


Round 2 associates the result of the * operation with the
return value of the function, and the assignment of the
function's result to the local variable results in tainting
that local variable. At this point the user could decide that
write is a library function whose type they do not wish to
change. This results in an explicit type cast away from the
new type. As there are no more production rules to apply, the
algorithm terminates successfully.


[[ft(temperature)]] int
threshold([[ft(temperature)]] int T) {
  [[ft(factor)]] int F = 3;
  return F * T;
}
[[ft(temperature)]] int T2 =
    threshold(SensorTemp);
write(T2);

[Figure: simplified AST after propagation, with fictive type colouring; recoverable node labels: “fn threshold”, “call threshold”.]

If an irrecoverable error is discovered, the algorithm
terminates with the error. For example, a function containing
two return statements that return values of different fictive
types would be classified as an irrecoverable error.
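The return-statement example can be expressed as a small consistency check. This is only a sketch under the assumption that the fictive type of every return statement in a function has already been collected into a list; returnTypesAgree is a hypothetical helper, not part of the tool.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Returns true when all return statements of a function carry the same
// fictive type (an empty or single-element list trivially agrees).
bool returnTypesAgree(const std::vector<std::string>& fictiveReturnTypes) {
    // adjacent_find with not_equal_to locates the first neighbouring pair
    // that differs; reaching end() means no such pair exists.
    return std::adjacent_find(fictiveReturnTypes.begin(),
                              fictiveReturnTypes.end(),
                              std::not_equal_to<>{}) ==
           fictiveReturnTypes.end();
}
```

A false result corresponds to the conflict described above, for which no configuration choice can make the function's return type well defined.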


Transition to strong type 


After the successful propagation, the tool generates strong
types for the fictive types used and rewrites the code to use
the new types. Developers are then encouraged to add
invariants and additional semantics, or to apply existing
refactoring operations on these types. The rewritten
variables are explicitly wrapped and unwrapped at interface
boundaries [3].


class temperature { /* ... */ };
class factor { /* ... */ };
temperature operator*(factor, temperature);

temperature threshold(temperature T) {
  factor F{3}; // Explicit cast from int literal.
  return F * T;
}

temperature SensorTemp = /* ... */;
temperature T2 = threshold(SensorTemp);
write(static_cast<int>(T2));
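For readers who want to compile the listing, one possible shape of the generated classes is sketched below. The class bodies are our assumption for illustration (an int payload, explicit constructors and explicit conversion operators); the tool's actual output may differ.

```cpp
// Assumed class body: an int payload that can only be wrapped and
// unwrapped explicitly.
class temperature {
public:
    explicit temperature(int value) : value_(value) {}
    explicit operator int() const { return value_; }
private:
    int value_;
};

// Assumed class body, same shape as temperature.
class factor {
public:
    explicit factor(int value) : value_(value) {}
    explicit operator int() const { return value_; }
private:
    int value_;
};

// The one operator the developer configured: factor * temperature.
inline temperature operator*(factor f, temperature t) {
    return temperature{static_cast<int>(f) * static_cast<int>(t)};
}

inline temperature threshold(temperature T) {
    factor F{3};  // Explicit wrap of the int literal.
    return F * T;
}
```

Making both the constructors and the conversion operators explicit is what forces the wrap and unwrap steps to appear in the source at interface boundaries.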


The project is now enhanced with increased type safety.
Importantly, future development efforts that would violate
the interface discovered for this type will result in an
error message from any standard-compliant compiler.


some_function(T2 + 1.5); 


test.cpp:11:18: error: invalid operands to binary 
'+' expression ('temperature' and 'double') 
    some_function(T2 + 1.5);
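The same rejection can be pinned down in a compile-time check instead of being observed as a build failure. The sketch below uses the standard detection idiom; is_addable is a hypothetical helper, and the temperature class here is an assumed stand-in with an explicit constructor and conversion, since the generated class body is not shown.

```cpp
#include <type_traits>
#include <utility>

// Assumed stand-in for the generated strong type.
class temperature {
public:
    explicit temperature(int value) : value_(value) {}
    explicit operator int() const { return value_; }
private:
    int value_;
};

// Detection idiom: true only if the expression `L + R` is well-formed.
template <typename L, typename R, typename = void>
struct is_addable : std::false_type {};

template <typename L, typename R>
struct is_addable<L, R,
                  std::void_t<decltype(std::declval<L>() + std::declval<R>())>>
    : std::true_type {};

// Plain arithmetic types still add; temperature + double is not part of the
// discovered interface, so it must not compile.
static_assert(is_addable<int, double>::value);
static_assert(!is_addable<temperature, double>::value);
```

If a later change accidentally introduced an implicit conversion or an operator+ for temperature, the second static_assert would fail and flag the widened interface.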